PRODUCTION OF COMPLEX CARBOHYDRATES

CLAIM OF PRIORITY 
This application claims priority under 35 U.S.C. 119(e) from U.S.
Provisional Application Serial No. 60/134,756, filed May 18, 1999, which
application is incorporated herein by reference.

GOVERNMENTAL RIGHTS 
The United States Government retains certain rights in this invention. 
Diseases under Grant Number A124016 and from the NIH National Center for
Research Resources under Grants Number RR01614 and RR04112.

FIELD OF THE INVENTION 
This invention relates to a method for the production of complex
carbohydrates on an LPS backbone structure in Gram-negative bacteria.

BACKGROUND OF THE INVENTION 
Complex carbohydrates occur in nature and are involved in a wide array of
biological functions, including viral, bacterial and fungal pathogenesis, cell-to-cell
and intracellular recognition, binding of hormones and pathogens to cell-surface
receptors and in antigen-antibody recognition. The term "complex carbohydrates"
embraces a wide array of chemical compounds having the general formula (CH2O)n
where the monomer unit is selected from any of thousands of naturally occurring or
synthetic monomers, including, but not limited to, glucose, galactose, mannose,
fucose and sialic acid. Saccharides may have additional constituents such as amino,
sulfate or phosphate groups, in addition to the carbon-hydrogen-oxygen core. The
polymer consisting of two to ten saccharide units is termed an oligosaccharide (OS)
and that consisting of more than ten saccharide units is termed a polysaccharide
(PS). These monosaccharide building blocks can be linked in at least 10 different
ways, leading to an astronomical number of different combinations and
permutations. It is found that strains within species and even tissue within an
organism differ in complex carbohydrate structure. This high degree of variability,
the highly specific composition of naturally occurring complex carbohydrates and
the wide range of biological roles make these compounds especially significant.

Gram-negative bacteria contain complex carbohydrates, which are linked to
lipids to form lipooligosaccharides (LOS) or lipopolysaccharides (LPS). The
immunogenicity of the LOS and LPS resides in the carbohydrate moiety, while
pathogenicity resides in the lipid moiety. For this reason, OS and PS are useful as
vaccines against Gram-negative pathogens and for identification of Gram-negative
bacteria.

United States Patent Application 5,736,533 discloses oligosaccharides useful
as therapeutic agents against pathogens that are the causative agents of respiratory
infections. It is believed that pathogenic bacteria are able to colonize tissue by
binding to carbohydrates on the surface of the tissue and that providing an excess of
specific soluble oligosaccharides can result in competitive inhibition of bacterial
colonization.

OS and PS from LPS and LOS can be produced by growing the specific
bacterial pathogen in culture, with subsequent cleavage of the lipid moiety and
purification. However, most pathogenic bacteria are fastidious in their growth
requirements and slow growing, making this mode of production impractical. For
example, Haemophilus influenzae is known to require a carbon dioxide atmosphere
and brain/heart extract for growth. Helicobacter pylori grows very poorly in broth
cultures required for OS and PS production. In addition, many of these bacterial
pathogens (for example, Neisseria meningitidis) can be dangerous to grow in large
volumes because of the risk of aerosol and possible infection spread. The ability to
produce the OS and PS structures of fastidious bacterial pathogens in bacterial
strains such as Escherichia coli and Salmonella minnesota which grow rapidly to
high density offers a rapid way to produce these OS and PS from fastidious bacterial
pathogens.
Eucaryotic proteins and peptides frequently have carbohydrate moieties on
their surfaces, which act as specific binding sites for hormones, which are also
glycosylated, that is, have complex carbohydrates linked to the peptide structure.
Moreover, in addition to the recognition role, carbohydrates are necessary to the
proper three-dimensional folding of polypeptides into functional glycoproteins.
Bacteria do not glycosylate peptides and proteins efficiently or in a manner
equivalent to that of eucaryotes. For that reason, although bacteria are widely used
as production cells for growing eucaryotic peptides and proteins, useful human
glycopeptides such as erythropoietin are grown in mammalian cells. United States
Patent Number 4,703,008 discloses a method for the production of erythropoietin, in
which cells such as Chinese hamster ovary cells are transfected with the DNA
coding for the hormone and grown under a carbon dioxide atmosphere in complex
medium. The resulting hormone is sufficiently similar to the naturally occurring
hormone to be effective as a therapeutic for human use.

An additional utility for isolated, cell-specific carbohydrates is for
competitive inhibition of disease agents in which infection is reliant on
surface-recognition glycosylated proteins. For example, the human immunodeficiency virus
is known to bind to the surface receptor on T-4 lymphocytes. If an excess of free
T-4 receptor carbohydrate is present in the bodily fluids of the patient, the virus will
bind to the free carbohydrate and is effectively prevented from infecting the T-4
lymphocyte.

Competitive inhibition of binding of antibodies to cell surfaces by 
administration of cell-recognition molecules may have therapeutic potential in the 
treatment of autoimmune diseases such as lupus erythematosus, multiple sclerosis 
and rheumatoid arthritis. Such molecules may bind to the cell receptor, blocking the
binding of the autoimmune antibodies which cause the degeneration seen in such
disease states. 

United States Patent Number 4,745,051 discloses a method for expressing
DNA in an insect cell, a method that has practical application for the production of
glycosylated peptides and proteins. However, the glycosylation resulting is that
native to the insect, consisting of higher levels of mannose than are typical of
mammalian cells.

Practical production of peptides and polypeptides in bacterial production
cells is well established. Chemical and enzymatic means for glycosylating peptides
and proteins are well known in the art. For example, United States Patent 5,370,872
discloses a method for coupling PS through a carboxyl or hydroxyl group to a
protein. Classic organic syntheses of complex carbohydrates have long been
known, but with limited practical application. In addition to the difficulties inherent
in the complexity of the glycopolymer molecule, many glycosidic bonds are labile
and must be protected and deprotected during chemical synthesis, adding to the
difficulty of synthesis and reducing the yield of product.

Because of the drawbacks of organic synthesis, enzymatic synthesis has been
devised. It is known that glycosylation proceeds by the step-wise addition of
monomers through the action of such enzymes as glycotransferases. The reaction
products can be further modified by lyases, acetylases, sulfatases, phosphorylases,
kinases, epimerases, methylases, transferases and the like. United States Patent
Number 5,308,460 discloses such a step-wise synthesis on an immobilized matrix.

A need remains for a more efficient and practical method for the production 
of complex carbohydrates, and glycoproteins and glycopeptides containing complex 
carbohydrates specific to a species or tissue.

SUMMARY OF THE INVENTION 
The present invention is directed to the production of complex carbohydrates
in a production cell. It is here disclosed that certain bacteria, such as Escherichia
coli Strain K-12, have a core liposaccharide with a terminal heptose. A suitable
production cell also contains an enzyme which catalyzes the transfer to the terminal
heptose of an acceptor molecule, such as N-acetylglucosamine, to form a "scaffold"
upon which glycotransferases add other saccharide monomers to form complex
carbohydrates. If an otherwise suitable production cell lacks such an enzyme, the
DNA encoding the gene rfe (UDP-GlcNAc:Undecaprenol GlcNAc-1 phosphate
transferase) of Haemophilus influenzae may be inserted into the production cell.
Preferably, production of rfe is enhanced by the presence of the gene products of the
gene lsgG. By inserting genes encoding glycotransferases into the production cell,
the complex carbohydrates specific to bacteria such as Haemophilus influenzae,
Neisseria spp., Salmonella spp. and Escherichia coli are produced. Mammalian
complex carbohydrate such as polysialyl can also be produced.

Accordingly, the invention provides a process for the production of a
complex carbohydrate which comprises the steps of: (a) inoculating transformed
production cells into a culture medium capable of supporting the growth of said
production cells, wherein said production cells are prepared by transforming bacteria
comprising (i) a core lipid structure containing a terminal heptose molecule and (ii)
an enzyme capable of adding an acceptor molecule to said heptose molecule by
inserting an isolated DNA sequence encoding a glycotransferase that synthesizes a
complex carbohydrate into said bacteria to yield transformed production cells; (b)
allowing growth of said transformed production cells; and (c) recovering said
complex carbohydrate from the culture medium.

The invention also provides a process for the production of an
oligosaccharide which comprises the steps of: (a) transforming gram-negative
bacteria comprising (i) a core lipid structure containing a terminal heptose and (ii)
an enzyme that adds a galactose molecule to said heptose wherein said transformed
gram-negative bacteria are prepared by constructing a vector comprising an isolated
DNA sequence coding for a glycotransferase that synthesizes an oligosaccharide;
(b) inoculating said transformed gram-negative bacteria into a culture medium
capable of supporting the growth of said transformed bacteria; (c) allowing growth
of said inoculated gram-negative bacteria; and (d) recovering said oligosaccharide
from the culture medium.

Using methods disclosed in this application, a production cell suitable for the
practical production of other complex carbohydrates can be identified. Such a
suitable production cell will have an acceptor molecule specific to the carbohydrate
to be synthesized, or a site that can be modified to add such a specific acceptor
molecule. The production cell will contain the initiating lsgG or lsgF to form the
appropriate acceptor. DNA coding for the glycotransferases of other species,
strains, tissues, hormones, receptors or other cell-surface carbohydrates can then be
inserted into such a production cell, with the resultant production of
oligosaccharides or polysaccharides specific to that species, strain, tissue, hormone,
receptor or other cell-surface carbohydrate. The nucleotide sequences for the genes
rfe and lsgG are on file in the H. influenzae Rd database available from TIGR
(Bethesda, MD). Sequences for glycotransferases are available from the references
herein disclosed.

Also provided are methods for separating and purifying the product.

The invention also provides a process for the production of a complex
carbohydrate, comprising culturing production cells comprising a chimeric DNA
sequence encoding a glycotransferase, so as to yield production cells comprising an
altered level of complex carbohydrate, wherein the production cells are bacteria
comprising a core lipid structure containing a terminal heptose molecule and
encoding an enzyme capable of adding an acceptor molecule to the heptose
molecule. The invention also provides a process further comprising recovering the
complex carbohydrate.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1: The lsg region of Haemophilus influenzae DNA.
(A) Diagram of the eight orfs. (B) Locations of m-Tn3(Cm) insertion sites (6). (C)
Restriction maps of the EMBLOS-1 subclones that modified the E. coli JM109
LPS.

Figure 2: SDS-PAGE of the LPS from E. coli strain JM109 (designated
pGEM) and the three chimeric strains, pGEMLOS-7, pGEMLOS-5, and
pGEMLOS-4. The LPS range in molecular weight from ~3.3 to 5.5 kDa.

Figure 3: Proposed structures of the chimeric oligosaccharides. Only the
complete E. coli K-12 core structure containing a fourth heptose on the terminus of
the oligosaccharide branch undergoes modification. Additional saccharides
(designated R) are added to the 7-position of this heptose to form the chimeric
oligosaccharides.

DETAILED DESCRIPTION OF THE INVENTION 
The present invention provides a method by which the terminal heptose of
the core structure of any gram-negative bacterial species which contains the gene rfe
(UDP-GlcNAc:Undecaprenol GlcNAc-1 phosphate transferase) (Alexander et al. (1994) J.
Bacteriol., 176:7079-7084) can be modified so as to act as an acceptor for
oligosaccharide synthesis. The rfe gene encodes a protein which catalyzes the
transfer of N-acetyl glucosamine (GlcNAc, an "acceptor molecule") onto the carrier
lipid undecaprenol phosphate. The regulation of this gene can be controlled with a
regulatory gene, lsgG. The increase in rfe
expression caused by lsgG mediates the deposition of a GlcNAc residue on the
terminal heptose of LPS and LOS from a variety of gram-negative bacterial species
including E. coli, Salmonella minnesota and H. influenzae. This GlcNAc has been
found to function as an acceptor molecule forming a scaffold for the sequential
addition of saccharide monomers, under the direction of glycotransferases. For
example, the galactosyltransferase gene, lsgF, results in the addition of a galactose
to the GlcNAc. The gene sequence coding for the lsg glycotransferases of H.
influenzae has been inserted into an Escherichia coli K-12 strain production cell,
with the resultant production of H. influenzae-specific LOS epitopes in E. coli.

Any production cell containing an initiating enzyme can provide an
appropriate acceptor to form the scaffold. The production cell will preferably also
contain the regulatory gene lsgG. DNA coding for the glycotransferases of other
species, strains, tissues, hormones, receptors or other cell-surface carbohydrates can
then be inserted into such a production cell, with the resultant production of
oligosaccharides or polysaccharides specific to that species, strain, tissue, hormone,
receptor or other cell-surface carbohydrate.

Definitions:
Complex carbohydrates: any polymer of formula (CH2O)n, where n equals at least
three monomer units, including polymers with additional substituents including but
not limited to SO4, PO4, CO4, CH3, NH4, and such polymers linked to lipids,
peptides and proteins.

Production cell: A production cell useful in this invention is defined as any
bacterium which contains an LPS or LOS-saccharide inner core terminating
in a molecule and containing an enzyme capable of adding an acceptor
molecule to the terminal molecule to serve as a scaffold for elongation and
which can be transformed with exogenous DNA coding for
glycosyltransferases. Cells that are otherwise suitable but lack the proper
acceptor molecule may be used as production cells if they are co-transformed
with genes such as rfe and lsgG, to appropriately modify the LPS or LOS to
function as an acceptor molecule for the formation of a scaffold.

Hib production cell: the preferred cell for production of H. influenzae type B-
specific OS is preferably a gram-negative bacterium, most preferably E. coli K-12
strain JM109.

Synthetic gene(s): the DNA coding for the enzyme or enzymes that synthesize the
desired complex carbohydrate. These genes include those coding for 
glycotransferases, lyases, acetylases, sulfatases, phosphorylases, kinases, 
epimerases, methylases and the like. 

Example 1. Selection of a Hib production cell.

Capsular strains of Haemophilus influenzae type b (Hib) are responsible for
various invasive and bacteraemic infections in humans, including meningitis and
pneumonia. The surface lipooligosaccharides (LOS) of Hib are known to be
important factors in microbial virulence and pathogenesis. Structural studies of Hib
LOS from wild-type and mutant strains have shown that the LOS contains a
conserved heptose trisaccharide core which can be extended with additional sugars
on each heptose. Recently, a revised structure of the E. coli K-12 core region was
reported which also contains a heptose trisaccharide inner core and a fourth heptose
present on the terminus of the main oligosaccharide branch:
                    Galα1  Hepα1
                      |      |
                      6      7
Hepα1→6Glcα1→2Glcα1→3Glcα1→3Hepα1→3Hepα1→5Kdo

Previous work showed that the core region of E. coli transformed with
synthetic enzyme genes could be elongated by the addition of saccharide monomers
under the direction of H. influenzae genes to produce a modified E. coli LPS that
was elongated by approximately five monomer units. It was thought that the
monomers were added at each of the heptoses. (Kwaik et al., Molecular
Microbiology, 5:2475-2480 (1990).) Therefore, efforts were made to transform an
E. coli K-12 strain termed JM109 with H. influenzae synthesis genes in an attempt
to determine whether an LOS substantially identical to that of H. influenzae could
be produced.

Escherichia coli strains were routinely cultured at 37° C using LB agar or
broth with appropriate antibiotics. Vectors used in these studies were previously
described (Kwaik et al. (1990)). LPS from E. coli JM109 was prepared by the
extraction procedure of Darveau and Hancock (Darveau et al. J. Bacteriol. 155(2),
831-838 (1983).) The LPS was separated by SDS-PAGE in resolving gels
containing 15% acrylamide, and visualized by silver staining.
To determine the structure of this chimeric LPS, a few milligrams of LPS
from each sample were treated with anhydrous hydrazine at 37° C for 20 minutes,
and then precipitated with cold acetone.

In order to establish the chemical structure of the E. coli core and determine
the E. coli acceptor residue, the LPS from E. coli strain JM109 was partially
characterized using composition analysis, linkage analysis, and mass spectrometry
as described in Example 6 below.



Example 2. Isolation of the LOS synthetic genes from Hib.

Hib strain A-2 was originally isolated from the spinal fluid of a child with
meningitis. Hib A-2 was grown on chocolate agar supplemented with amino acids
and vitamins or brain heart infusion agar supplemented with 4% Fildes reagent
(Difco Laboratories, Detroit) at 35° C in 5% CO2 atmosphere.

A gene cluster from Hib strain A-2 containing LOS synthesis genes (lsg) was
previously cloned (Kwaik et al. (1990)). The lsg loci are contained within a 7.4 kb
DNA fragment consisting of seven complete open reading frames (orfs). This
region is one of several distinct loci also found in the genome sequence of Hib strain
Rd which has been associated with lipopolysaccharide (LPS) biosynthesis.

DNA from the lsg region of Hib strain A-2 was used to construct a genomic
library in the lambda bacteriophage EMBL3 (Kwaik et al. (1990)). Twenty six
phage clones were prepared which expressed Hib LOS oligosaccharide epitopes in
E. coli strain LE392. The phage transformant designated EMBLOS-1 produced a
chimeric LPS with a 1.4 kDa oligosaccharide added to the 4.1 kDa LPS of E. coli
LE392. Monoclonal antibody (MAb) 6E4, which recognizes two components in the
Hib A-2 LOS mixture, also recognized the novel 5.5 kDa component in the chimeric
LPS, indicating some immunochemical similarity to Hib LOS.

Example 3. Transformation of the Hib production cell.

Restriction fragments of EMBLOS-1 were used to make a series of plasmids
which modified E. coli strain JM109 to give clones which produced a proposed
chimeric series of higher mass LPS species. The transformants termed
pGEMLOS-4, pGEMLOS-5, and pGEMLOS-7 generated modified or chimeric LPS
of 5.5, 5.1, and 4.5 kDa, respectively. All three apparently modified the 4.1 kDa
LPS species from E. coli, although only the LPS from pGEMLOS-4 expressed the
6E4 epitope. The LPS from strain pGEMLOS-5 was found to react positively with
MAb 3F11, suggesting the presence of terminal N-acetyllactosamine. The epitope
recognized by MAb 6E4 is also present in the LOS of H. influenzae nontypable
strain 2019, as well as the LPS from Salmonella minnesota Re mutant. Binding of
this monoclonal antibody to H. influenzae LOS can be inhibited by Kdo and the Kdo
trisaccharide from the Re mutant of S. minnesota. Because the 6E4 epitope has been
associated with the core of Haemophilus LOS, it was originally proposed that the
chimeric structures expressed in E. coli might arise from the addition of a
Haemophilus core structure to an acceptor residue of the E. coli 4.1 kDa LPS
species.

The Hib production cell was transformed with the plasmid pGEM3Zf+ into
which different DNA restriction fragments from the H. influenzae strain A-2 lsg locus
had been ligated (see Table 1 and Figure 1). The plasmid pGEMLOS-4 contained a
7.4 kb BamHI-PstI fragment of DNA which contained all seven open reading frames
(A-G) comprising the lsg locus. The plasmid pGEMLOS-5 contained a 5.5 kb
HindIII-PstI fragment of DNA comprising 5 open reading frames (C-G) of the lsg locus. The
plasmid pGEMLOS-7 contained a 2.8 kb SphI-PstI fragment of DNA comprising 2
open reading frames (F-G) of the lsg locus. The plasmid pGEM3Zf+ without an
insert was also transformed into strain JM109. This strain and the LPS isolated
from it were termed PGEM.

Example 4. Isolation and Purification of Oligosaccharides.

The LPS from PGEM (31 mg), pGEMLOS-4 (25 mg), pGEMLOS-5 (15
mg), and pGEMLOS-7 (4.4 mg) was hydrolyzed in 1% acetic acid (2 mg LPS/ml)
for 2 hours at 100° C. The hydrolysates were centrifuged at 5000g for 20 min at 4°
C and the supernatants removed. The pellets were washed with 2 ml of H2O and
centrifuged again (5000g, 20 min, 4° C). The supernatants and washings were
pooled and lyophilized to give the oligosaccharide fractions. As a standard, 10 mg
of LPS from Salmonella typhimurium 119 Ra mutant (Sigma, St. Louis) was
treated in the same fashion.
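
The stated loading of 2 mg LPS per ml fixes the volume of 1% acetic acid needed for each sample. The following is a minimal Python sketch of that arithmetic; the function and variable names are illustrative assumptions and are not part of the procedure itself.

```python
# LPS masses (mg) as stated for the hydrolysis step.
LPS_MG = {"PGEM": 31.0, "pGEMLOS-4": 25.0, "pGEMLOS-5": 15.0, "pGEMLOS-7": 4.4}

# Stated loading: 2 mg LPS per ml of 1% acetic acid.
LPS_MG_PER_ML = 2.0


def acid_volume_ml(lps_mg, mg_per_ml=LPS_MG_PER_ML):
    """Return the volume of 1% acetic acid (ml) for a given LPS mass (mg)."""
    return lps_mg / mg_per_ml


# Volume of acid for each sample at the stated loading.
volumes = {name: acid_volume_ml(mg) for name, mg in LPS_MG.items()}

for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} ml of 1% acetic acid")
```

At this loading the 10 mg Salmonella standard would likewise call for 5 ml.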

To prepare desalted oligosaccharide pools for ESI-MS analysis, small
aliquots of the crude oligosaccharide fractions (<2 mg) were chromatographed on
two Bio-Select SEC 125-5 HPLC columns (Bio-Rad, Richmond, CA) connected in
series, using 0.05 M pyridinium acetate (pH 5.2) at a flow rate of 1 ml/min. A
refractive index detector was used to monitor column effluent and chromatograms
were recorded and stored with an integrator.
For large scale separations, the oligosaccharide fractions from PGEM (10.2
mg), pGEMLOS-4 (9.3 mg), and pGEMLOS-5 (7.0 mg) were dissolved in 0.3 ml of
0.05 M pyridinium acetate buffer (pH 5.2) and centrifuge-filtered through a 0.45 µm
Nylon-66 membrane. The PGEM and pGEMLOS-4 samples were applied to a
single Bio-Gel P-4 column (1.6 x 84 cm, <400 mesh; Bio-Rad), and the
pGEMLOS-5 sample was applied to two Bio-Gel P-4 columns connected in series
(1.6 x 79 cm and 1.6 x 76.5 cm). The columns were equipped with water jackets
maintained at 30° C. Upward elution at a flow rate of 10 ml/h was achieved with a
P-1 peristaltic pump (Pharmacia, Piscataway). Effluent was monitored with
refractive index and fractions were collected at 10 minute intervals and evaporated
to dryness in a concentrator.
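
The flow rate and collection interval together determine the volume of each fraction, and the column dimensions give a rough upper bound on bed volume. The sketch below works both out in Python. It assumes the quoted dimensions are inner diameter by bed length, which the text does not state explicitly, and the helper names are illustrative.

```python
import math


def bed_volume_ml(diameter_cm, length_cm):
    """Geometric volume of a cylindrical column bed (1 cm^3 = 1 ml)."""
    radius_cm = diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * length_cm


def fraction_volume_ml(flow_ml_per_h, interval_min):
    """Volume collected per fraction at a constant flow rate."""
    return flow_ml_per_h * interval_min / 60.0


# Single column: 1.6 x 84 cm (assumed inner diameter x bed length).
single_ml = bed_volume_ml(1.6, 84.0)

# Two columns in series: 1.6 x 79 cm and 1.6 x 76.5 cm.
series_ml = bed_volume_ml(1.6, 79.0) + bed_volume_ml(1.6, 76.5)

# 10 ml/h collected at 10 minute intervals.
per_fraction_ml = fraction_volume_ml(10.0, 10.0)

print(f"single column bed volume: {single_ml:.1f} ml")
print(f"series bed volume:        {series_ml:.1f} ml")
print(f"volume per fraction:      {per_fraction_ml:.2f} ml")
```

These are geometric volumes only; the void and included volumes of the packed gel would be smaller and are not given in the text.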

Example 5. Dephosphorylation of Oligosaccharides.

Oligosaccharide fractions were placed in 1.5 ml polypropylene tubes and
treated with cold 48% aqueous hydrogen fluoride to make 5-10 µg/ml solutions.
Samples were kept for 18 hours at 4° C and then aqueous HF was evaporated. The
dephosphorylated samples were then rechromatographed on two Bio-Select SEC
125-5 HPLC columns connected in series using 0.05 M pyridinium acetate (pH 5.2).

Example 6. Characterization of product.
The reactivity with monoclonal antibodies raised to the naturally occurring
Hib LOS, as shown in Example 2, indicated that the product had the same
immunochemical function as the naturally occurring Hib LOS. The samples were
further analyzed by different techniques in order to determine structural identity to
the desired complex carbohydrate.

a. Monosaccharide Composition Analysis.

Dephosphorylated oligosaccharide fractions were dissolved in 400 µl of 2 M
trifluoroacetic acid and heated for 4 hours at 100° C. The hydrolysates were
evaporated to dryness in a Speed-Vac concentrator, redissolved in 20 µl H2O, and
dried again. Hydrolysates were analyzed by high-performance anion exchange
chromatography with pulsed amperometric detection using a Dionex BioLC system
(Dionex, Sunnyvale, CA) with a CarboPac PA1 column.

b. Methylation Analysis.
Linkage analysis was performed on dephosphorylated oligosaccharide
fractions using the microscale method modified for use with powdered NaOH.
Partially methylated alditol acetates were analyzed by GC/MS in the EI and CI
modes on a mass spectrometer.

c. Liquid Secondary Ion Mass Spectrometry (LSIMS).
LSIMS was performed using a mass spectrometer with a cesium ion source.
Oligosaccharide samples (in 1 µl H2O) were added to 1 µl of glycerol/thioglycerol
(1:1) on a stainless steel probe tip. A Cs+ ion primary beam energy of 10 keV was
used and the secondary sample ions were accelerated to 8 keV. Scans were taken in
the negative-ion mode at 300 s/decade and recorded with an electrostatic recorder.
The spectra were mass calibrated manually with Ultramark 1621 (PCR Research
Chemicals, Inc., Gainesville, FL) to an accuracy of better than ± 0.2 Da.

d. Electrospray Ionization Mass Spectrometry (ESI-MS).
Oligosaccharides and O-deacylated LPS were analyzed on a mass
spectrometer with an electrospray ion source operating in the negative-ion mode.
Oligosaccharide samples were dissolved in H2O mixed with running solvent (1 µl
in 4 µl), and injected into a stream of H2O/acetonitrile (1/1, v/v) containing 1%
acetic acid, at a flow rate of approx. 20 µl/min. Mass calibration was carried out
with CsI in the negative-ion mode.
In some cases, selected oligosaccharide fractions were analyzed at higher
resolving power (M/ΔM = 2000) using a sector-orthogonal time of flight (TOF)
instrument with an array detector operating under ESI conditions in the negative-ion
mode. In this case, the solvent system and flow rate were essentially the same as
described above for the quadrupole ESI experiments. A scan speed of 5 sec/decade
was used for all samples over the m/z range of 500 to 3000 with an accelerating
voltage of 4 kV and an ESI needle voltage of between 3.5-4 kV higher. Mass
calibration was carried out with an external reference consisting of CsI taken under
liquid secondary ion mass spectrometry conditions, followed by a one point
correction of the doubly charged deprotonated molecular ion of the oligosaccharide
from the LPS of Salmonella typhimurium Ra mutant ((M-2H)2- exact = 973.2) in
the negative-ion ESI-MS mode.
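
For a multiply deprotonated ion (M-zH)z-, the observed m/z relates to the neutral mass M by m/z = (M - z x m_H+)/z. The short Python sketch below applies that relation to the doubly charged calibrant ion quoted above. The proton mass constant and the function names are supplied here as assumptions for illustration and do not come from the text.

```python
# Mass of a proton in Da (standard value, supplied here as an assumption).
PROTON_MASS_DA = 1.007276


def neutral_mass_from_negative_ion(mz, charge):
    """Neutral mass M for a deprotonated ion (M - zH)^z- observed at m/z."""
    return charge * mz + charge * PROTON_MASS_DA


def mz_of_negative_ion(neutral_mass, charge):
    """Expected m/z for the (M - zH)^z- ion of a molecule of mass M."""
    return (neutral_mass - charge * PROTON_MASS_DA) / charge


# Doubly charged calibrant ion, (M-2H)2- = 973.2 as quoted in the text.
calibrant_neutral = neutral_mass_from_negative_ion(973.2, 2)

# The singly charged ion of the same molecule, for comparison.
calibrant_singly = mz_of_negative_ion(calibrant_neutral, 1)

print(f"neutral mass M:  {calibrant_neutral:.2f} Da")
print(f"(M-H)- expected: {calibrant_singly:.2f}")
```

The same conversion is what allows a charge state assignment to be checked: adjacent isotope peaks of a doubly charged ion are spaced about 0.5 m/z apart rather than 1.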

e. Matrix Assisted Laser Desorption Ionization (MALDI) Mass Spectrometry.

O-Deacylated LPS samples were analyzed on a Voyager or an Elite
MALDI-TOF instrument (PerSeptive Biosystems, Framingham, MA) equipped with
a nitrogen laser (337 nm). All spectra were recorded in the negative-ion mode using
delayed extraction conditions as described in detail elsewhere. (Gibson et al. J.
Amer. Soc. Mass Spec. 8:645-658 (1997)). Samples were dissolved in H2O (approx.
250 pmol/µl), and mixed 1:1 with the matrix solution (a saturated solution of
2,5-dihydroxybenzoic acid in acetone) and allowed to dry at room temperature on a
gold-plated MALDI plate. Approximately 100 laser shots were recorded for each
sample, averaged and then mass calibrated using an external mass calibrant
consisting of renin substrate tetradecapeptide, insulin chain B, oxidized, and bovine
insulin (all from Sigma). For external calibrations under these conditions, a mass
accuracy of 0.1% was obtained. For comparison purposes, a single point correction
was made to the spectra of the O-deacylated LPS from PGEM using the expected
lipid A fragment ion ((M-H)- average = 952.009), and then the spectra for the three
chimeric strains were recalibrated using this lipid A fragment ion and an additional
ion from PGEM (m/z 2835.7) present in all four samples.
f. Tandem Mass Spectrometry (MS/MS) Using Quadrupole-TOF (qTOF).

Dephosphorylated oligosaccharides were analyzed in the positive-ion mode
on a mass spectrometer equipped with a nanospray ion source. The analyzer
consists of a high pressure RF-only ion guide followed by a quadrupole mass filter.
A high pressure quadrupole collision cell follows the first mass filter. The TOF
mass analyzer is comprised of a reflectron with an effective flight path of 2.5 meters.
Samples were dissolved in H2O/acetonitrile (1/1, v/v) containing 1% acetic acid, and
2 µl of each was injected into a nanospray tip. The nanospray needle voltage was
typically 800-1000 V. One sample loading usually gave an analysis time of 30-40
min, which allowed a conventional mass spectrum to be obtained prior to the
selection of several individual ions for CID MS/MS. In MS mode the high
resolution capability (8,000 FWHM) allowed unambiguous determination of the
charge state for each ion. For CID-MS/MS operation the quadrupole mass analyzer
with a mass window of 1 m/z unit was used to select precursor ions for
fragmentation, which in most cases were doubly charged (M+2H)2+. The selected
ions were fragmented in a collision cell with air as the collision gas and analyzed in
the orthogonal TOF operating at an accelerating potential of 20 kV. Fragment ion
spectra were accumulated under computer control for periods of between 10 seconds
and 1 minute. Mass assignments based on external calibration were generally
within 50 ppm of calculated monoisotopic values whereas internal calibration gave
masses accurate to ±5 ppm.
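
To put the quoted 50 ppm and 5 ppm figures in absolute terms, the Python sketch below converts between a ppm error and a mass difference in Da. The m/z value of 1000 is a hypothetical round number chosen only for illustration, and the function names are likewise assumptions.

```python
def ppm_error(observed_mz, calculated_mz):
    """Signed mass error in parts per million relative to the calculated value."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


def tolerance_da(mz, ppm):
    """Absolute mass window (Da) corresponding to a ppm tolerance at a given m/z."""
    return mz * ppm / 1e6


# Hypothetical ion at m/z 1000 (illustrative value, not from the text).
EXAMPLE_MZ = 1000.0

external_window = tolerance_da(EXAMPLE_MZ, 50.0)  # external calibration
internal_window = tolerance_da(EXAMPLE_MZ, 5.0)   # internal calibration

print(f"50 ppm at m/z {EXAMPLE_MZ:.0f}: +/- {external_window:.3f} Da")
print(f" 5 ppm at m/z {EXAMPLE_MZ:.0f}: +/- {internal_window:.3f} Da")
```

On this scale, internal calibration narrows the acceptable window for a composition assignment by a factor of ten.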

g. SDS-PAGE Analysis of LPS.

We have previously reported the transformation of E. coli strain JM109 (a
K-12 strain which produces rough LPS (r-LPS) which lacks the O-side-chain) with a
series of plasmids containing overlapping restriction fragments of DNA from the lsg
region of H. influenzae type b strain A-2 (Kwaik et al. (1990)). Partial LOS
segments were produced. As diagrammed in Figure 1, the pGEMLOS-4 clone
contains all of the complete orfs (orfs A-G) in the lsg region, whereas pGEMLOS-5
contains orfs C-G, and pGEMLOS-7 contains orfs F-G. The clones pGEMLOS-4,
pGEMLOS-5, and pGEMLOS-7 were shown by SDS-PAGE to produce modified
LPS structures which added 1.4, 1.0, and 0.4 kDa moieties, respectively, to the 4.1
kDa E. coli core (Figure 2).

h. Analysis of O-deacylated LPS by MALDI-TOF. 

For initial screening of LPS molecular weights and heterogeneity by mass 
spectrometry, small aliquots of LPS from pGEM, pGEMLOS-4, pGEMLOS-5, and 
pGEMLOS-7 were treated with anhydrous hydrazine to remove O-linked fatty 
acids from the lipid A moiety. The pGEM O-deacylated LPS sample contains 
several species in the range of 2738-? Da, representing the major E. coli 
core structures. When fit to proposed compositions, the observed species were 
found to exhibit heterogeneity in heptose (Hep), hexose (Hex), 
3-deoxy-D-manno-octulosonic acid (Kdo), phosphate (Phos), and 
phosphoethanolamine (PEA) (Table 2). Specifically, two main core types were 
observed containing either 3 Hex and 3 Hep (with 2 or 3 Kdos) or 4 Hex and 4 
Hep (with 2 Kdos), with variable amounts of phosphate and PEA in both. The 
pGEMLOS-7 O-deacylated LPS mixture contained many of these same species, in 
addition to two major new species at (M-H)- 3334.5 and 3456.8. The m/z 3334.5 
species apparently arises from the addition of Hex and N-acetylhexosamine 
(HexNAc) to the pGEM core structure containing 4 Hex, 4 Hep, 2 Kdo, 2 Phos, 
and 1 O-deacylated diphosphorylated lipid A (O-DPLA) moiety. A further 
addition of 1 PEA moiety gives the m/z 3456.8 species. These data suggest 
that the transformation producing pGEMLOS-7 results in the addition of a 
Hex-HexNAc moiety to the E. coli LPS. Likewise, the main species in the 
pGEMLOS-5 O-deacylated LPS (m/z 3700.6 and 3823.6) were found to arise from 
the addition of 2 Hex plus 2 HexNAc to the pGEM core structure containing 4 
Hex, 4 Hep, 2 Kdo, 2 Phos, 1 O-DPLA, and 0 or 1 PEA (see Table 2). These 
structures are also found in the pGEMLOS-4 O-deacylated LPS, in addition to 
new species arising from the further addition of either another Hex (m/z 
4083.2 and 4206.4) or HexNAc (m/z 4124.5 and 4246.8) to, in this case, the 
pGEM core structure containing 4 Hex, 4 Hep, 3 Kdo, 2 Phos, 1 O-DPLA, and 0 
or 1 PEA (see Table 2). Of the chimeric LPS structures, only these high 
molecular weight pGEMLOS-4 components contained the third Kdo moiety. 
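
The composition assignments above rest on matching mass differences between species to sums of residue masses. The following Python sketch illustrates that reasoning under stated assumptions: the residue masses are standard average values rather than figures from this text, the function name `explain_shift` and the 1.0 Da tolerance are hypothetical, and phosphate is deliberately left out of the residue set to keep the example small. With a larger residue set, some shifts would match more than one combination. 

```python
from itertools import product

# Average residue masses in Da (standard values, not taken from the text).
RESIDUE_MASS = {
    "Hex": 162.14,
    "HexNAc": 203.19,
    "Kdo": 220.18,
    "PEA": 123.05,
}


def explain_shift(delta, tolerance=1.0, max_count=2):
    """List residue combinations whose summed mass matches delta (Da)."""
    names = list(RESIDUE_MASS)
    hits = []
    for counts in product(range(max_count + 1), repeat=len(names)):
        if not any(counts):
            continue  # skip the empty combination
        total = sum(n * RESIDUE_MASS[name] for n, name in zip(counts, names))
        error = delta - total
        if abs(error) <= tolerance:
            combo = {name: n for name, n in zip(names, counts) if n}
            hits.append((combo, round(error, 2)))
    # Best match (smallest absolute error) first.
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    # (M-H)- values quoted above for the O-deacylated LPS species.
    print(explain_shift(3456.8 - 3334.5))  # within pGEMLOS-7
    print(explain_shift(3700.6 - 3334.5))  # pGEMLOS-5 versus pGEMLOS-7
    print(explain_shift(4083.2 - 3700.6))  # pGEMLOS-4 versus pGEMLOS-5
```

The third shift is matched by one Hex plus one Kdo, in line with the observation that the larger pGEMLOS-4 species carry a third Kdo. 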

i. Analysis of Oligosaccharides by ESI-MS and LSIMS. 
The LPS from pGEM, pGEMLOS-4, pGEMLOS-5, and pGEMLOS-7 were subjected to 
mild acid hydrolysis to liberate free oligosaccharides. Initially, small 
aliquots of the oligosaccharide fractions were desalted by size exclusion 
HPLC and analyzed as mixtures by negative-ion ESI-MS. The ESI-MS spectra 
contained predominantly doubly charged ions, (M-2H)2-. In general, the data 
were consistent with results from the MALDI-TOF analysis of O-deacylated LPS. 
The pGEM sample was found to contain seven major oligosaccharides and several 
minor species, ranging in molecular weight from 1459.3 to 2016.7 Da. As shown 
in Table 3, proposed compositions were determined for the various species, 
which indicated that the structures consisted of two main core types: one 
containing 3 Hex, 3 Hep, and 1 Kdo, and another containing 4 Hex, 4 Hep, and 
1 Kdo. Variability in the number of phosphate and PEA groups was responsible 
for the large number of species present in the mixture. 
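
Because the ESI-MS spectra are dominated by doubly charged ions, molecular weights follow from the measured m/z only after charge deconvolution. The following Python sketch shows the arithmetic under the assumption that each ion is formed by loss of z protons; the function names are hypothetical and the example uses only the molecular weight range quoted above. 

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz_of_deprotonated(neutral_mass: float, charge: int) -> float:
    """m/z of an (M-zH)z- ion; charge is given as a positive integer."""
    return (neutral_mass - charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Invert the relation above to recover the neutral mass Mr."""
    return mz * charge + charge * PROTON_MASS


if __name__ == "__main__":
    for mr in (1459.3, 2016.7):  # Mr range quoted for the pGEM sample
        mz = mz_of_deprotonated(mr, 2)
        print(f"Mr {mr:.1f} -> (M-2H)2- at m/z {mz:.2f}")
```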

The pGEMLOS-4, pGEMLOS-5, and pGEMLOS-7 samples contained many of the 
species found in the pGEM sample, in addition to larger molecular weight 
oligosaccharides (Table 3). New LPS glycoforms of Mr 2177.7 and 2302.5 were 
observed in the pGEMLOS-7 sample, consistent with the addition of a single 
Hex and HexNAc residue to the pGEM core structure containing 4 Hex, 4 Hep, 1 
Kdo, 2 Phos, and 0 or 1 PEA. The high molecular weight components of the 
pGEMLOS-5 sample (Mr 2543.9 and 2666.5) suggested the further addition of yet 
another Hex-HexNAc unit, and the pGEMLOS-4 sample contained even higher 
molecular weight materials (ranging from Mr 2706.1 to 2870.0) consistent with 
the addition of one more Hex or HexNAc moiety. 

To aid in the determination of proposed compositions for these species, 
oligosaccharides from the pGEM, pGEMLOS-4, pGEMLOS-5, and pGEMLOS-7 samples 
were separated by size exclusion chromatography and fractions were analyzed 
by LSIMS and/or ESI-MS. Selected fractions representing the two major pGEM 
core types and the various chimeric structures were then pooled, 
dephosphorylated with aqueous HF, rechromatographed on size-exclusion HPLC, 
and analyzed again by negative-ion LSIMS or ESI-MS. Proposed compositions for 
the molecular ions observed after HF treatment are listed in Table 4. Upon 
removal of phosphate and PEA moieties, the major high mass species present in 
the pGEMLOS-7 sample is an oligosaccharide of Mr (avg.) 2020.3 (1 HexNAc, 5 
Hex, 4 Hep, and 1 Kdo). The pGEMLOS-5 sample contains an oligosaccharide of 
Mr (avg.) 2386.3, resulting from the further addition of 1 Hex and 1 HexNAc 
to the pGEMLOS-7 LPS (2 HexNAc, 6 Hex, 4 Hep, and 1 Kdo). This species is 
also present in the pGEMLOS-4 sample, in addition to higher molecular weight 
structures containing an additional Hex (Mr (avg.) 2548.4) or HexNAc (Mr 
(avg.) 2589.5). 
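
The calculated molecular weights that accompany these compositions in Table 4 can be reproduced by summing residue masses and adding one water for the free reducing end. The following Python sketch is illustrative only: the residue masses are standard values rather than figures from this text, and the function name `oligosaccharide_mass` is hypothetical. 

```python
# Average and monoisotopic residue masses in Da (standard values).
AVERAGE = {"Hex": 162.1406, "HexNAc": 203.1925, "Hep": 192.1666, "Kdo": 220.1767}
EXACT = {"Hex": 162.0528, "HexNAc": 203.0794, "Hep": 192.0634, "Kdo": 220.0583}
WATER_AVERAGE = 18.0153
WATER_EXACT = 18.0106


def oligosaccharide_mass(composition, exact=False):
    """Mass of a free oligosaccharide: residue masses plus one water."""
    table = EXACT if exact else AVERAGE
    water = WATER_EXACT if exact else WATER_AVERAGE
    return sum(table[name] * count for name, count in composition.items()) + water


if __name__ == "__main__":
    compositions = {
        "pGEM (Fr. 2)": {"Hex": 3, "Hep": 3, "Kdo": 1},
        "pGEM (Fr. 1)": {"Hex": 4, "Hep": 4, "Kdo": 1},
        "pGEMLOS-7": {"HexNAc": 1, "Hex": 5, "Hep": 4, "Kdo": 1},
        "pGEMLOS-5": {"HexNAc": 2, "Hex": 6, "Hep": 4, "Kdo": 1},
    }
    for name, comp in compositions.items():
        avg = oligosaccharide_mass(comp)
        mono = oligosaccharide_mass(comp, exact=True)
        print(f"{name}: avg {avg:.1f}, exact {mono:.1f}")
```

Comparing the average values with the observed Mr (avg.) figures quoted above gives a quick check that a proposed composition is plausible. 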

j. Monosaccharide Composition and Linkage Analyses. 

Mass spectrometric analyses of the free oligosaccharides from pGEM, 
pGEMLOS-4, pGEMLOS-5, and pGEMLOS-7 indicated that the different chimeric 
structures arise from additions of stoichiometric amounts of hexose and 
HexNAc residues to a variably phosphorylated pGEM core structure containing 4 
Hex, 4 Hep, and 1 Kdo. No chimeric structures were observed to contain the 3 
Hex, 3 Hep, and 1 Kdo core. 

For comparison purposes, dephosphorylated oligosaccharide fractions 
containing the two pGEM core types and the main chimeric structures from 
pGEMLOS-4, pGEMLOS-5, and pGEMLOS-7 were hydrolyzed in 2 N trifluoroacetic 
acid to determine their monosaccharide compositions, and therefore the 
identities of the Hex and HexNAc residues. When analyzed by high pH anion 
exchange chromatography with pulsed amperometric detection, the pGEM 
hydrolysates were found to contain only galactose, glucose, and 
L-glycero-D-manno-heptose (Table 5). (The Kdo residue is not recovered under 
these hydrolysis conditions.) The two core types were identified as 
GalGlc2Hep3 and GalGlc3Hep4. The pGEMLOS-7 sample contained 
GlcNH2Gal2Glc3Hep4 (Table 5), suggesting that the larger pGEM core was being 
modified by the addition of one Gal and one N-acetylglucosamine (GlcNAc) 
residue. Likewise, the composition of the pGEMLOS-5 sample suggested that the 
larger pGEM core was being further glycosylated with only Gal and GlcNAc 
residues. Fraction 2 from pGEMLOS-4, which contained the same species as 
pGEMLOS-5, gave similar results, and fraction 1 from pGEMLOS-4, which 
contains three main species (see Table 4), contained slightly more GlcNH2. 

Aliquots of the same six dephosphorylated oligosaccharide fractions used for 
carbohydrate composition analysis were taken for methylation analysis to 
establish sugar linkage positions. The partially methylated alditol acetates 
observed by GC/MS are listed in Table 6. Again, by comparing the two pGEM 
core types, it is relatively straightforward to see that the second terminal 
heptose of the larger pGEM core is converted to a 1,7-linked heptose in all 
of the chimeric structures and thus must represent the linkage site for the 
novel glycosylation. Since no chimeric structures were observed with the Hep3 
core, it is most likely that the nonreducing terminal heptose recently 
identified on the oligosaccharide branch in the K-12 core structure is the 
modified terminal heptose. Additionally, no new trilinked saccharides were 
obtained from the chimeric oligosaccharides, suggesting that the sugars were 
most likely all added in a straight chain. 

k. Sequencing of Chimeric Oligosaccharides by MS/MS. 

To confirm the identity of the linkage site between the E. coli LPS core and 
the novel oligosaccharide moieties, and to determine the sequences of the 
added sugars, the dephosphorylated oligosaccharides were subjected to MS/MS 
analysis. For these experiments, samples were run in the positive-ion mode 
and doubly charged molecular ions, (M+2H)2+, were selected for 
collision-induced dissociation (CID). Various reducing-terminal (Y-type) and 
non-reducing terminal (B-type) sequence ions are present in the spectra. For 
the pGEM oligosaccharide, the Y ion series, including fragment ions at m/z 
732.2 (2+), m/z 651.2 (2+), and m/z 1139.3, and the corresponding B ion 
series, including fragment ions at m/z 517.2 (B3α), m/z 841.3 (B4α), m/z 
1225.4 (B5), and m/z 1417.4, support the published structure with the fourth 
heptose on the non-reducing terminus of the largest oligosaccharide branch. 
In addition to these sequence ions, several ions present in the spectrum 
apparently arise from internal cleavages, which can occur under high energy 
CID conditions. In the spectrum of the pGEMLOS-5 oligosaccharide, two similar 
Y- and B-type ion series clearly define the sequence and linkage site of the 
added tetrasaccharide. Intense B ions at m/z 366.1 (B2α) and 731.3 (B4α) 
arise from the sequential cleavage of two Hex-HexNAc moieties. These losses 
are also represented by the corresponding Y-type fragment ions at m/z 2020.6 
and m/z 1655.5. Fragment ions at m/z 923.3 (B5α) and m/z 1463.4 (Y6α) confirm 
that the Hex-HexNAc-Hex-HexNAc moiety is linked to a heptose, and additional 
cleavages further along the large oligosaccharide branch confirm that the 
novel tetrasaccharide is attached to the largest branch of the pGEM core 
structure. 

In the MS/MS spectra of the chimeric oligosaccharides from pGEMLOS-7 and 
pGEMLOS-4, intense B ions also clearly defined the structures of the added 
sugar moieties. In the pGEMLOS-7 oligosaccharide (Mr 2019.7), a B ion at m/z 
366.1 corresponds to a single Hex-HexNAc disaccharide moiety. The pGEMLOS-4 
oligosaccharide of Mr 2587.9 (HexNAc3Hex6Hep4Kdo) lost a HexNAc-Hex-HexNAc 
fragment (m/z 569.2) and a HexNAc-Hex-HexNAc-Hex-HexNAc fragment (m/z 934.3), 
whereas the pGEMLOS-4 oligosaccharide of Mr 2546.8 (HexNAc2Hex7Hep4Kdo) lost 
a Hex-Hex-HexNAc (m/z 528.2) and a Hex-Hex-HexNAc-Hex-HexNAc (m/z 893.3) 
fragment. In addition to those B-type ions, the latter spectrum also 
contained large ions at m/z 366.1 and 731.3, which apparently arise as 
internal fragments in that case. 
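
The B-ion values quoted above can be checked against the sugar sequences they are assigned to. The following Python sketch assumes singly protonated B-type (oxonium) ions, whose m/z is the sum of the monoisotopic residue masses plus one proton; the residue masses are standard values rather than figures from this text, and the function name `b_ion_mz` is hypothetical. 

```python
PROTON_MASS = 1.00728  # mass of a proton in Da

# Monoisotopic residue masses in Da (standard values).
MONOISOTOPIC_RESIDUE = {"Hex": 162.0528, "HexNAc": 203.0794, "Hep": 192.0634}


def b_ion_mz(residues):
    """m/z of a singly charged B-type ion made of the listed residues."""
    return sum(MONOISOTOPIC_RESIDUE[name] for name in residues) + PROTON_MASS


if __name__ == "__main__":
    fragments = {
        "Hex-HexNAc": ["Hex", "HexNAc"],
        "(Hex-HexNAc)2": ["Hex", "HexNAc", "Hex", "HexNAc"],
        "(Hex-HexNAc)2-Hep": ["Hex", "HexNAc", "Hex", "HexNAc", "Hep"],
        "HexNAc-Hex-HexNAc": ["HexNAc", "Hex", "HexNAc"],
        "Hex-Hex-HexNAc": ["Hex", "Hex", "HexNAc"],
    }
    for label, residues in fragments.items():
        print(f"{label}: m/z {b_ion_mz(residues):.1f}")
```

Because the calculation depends only on composition, it cannot by itself distinguish a true B ion from an internal fragment of the same composition, which is why the m/z 366.1 and 731.3 ions in the last spectrum are interpreted from context. 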

Assuming that the oligosaccharides are built up sequentially, i.e., from 
pGEMLOS-7 to pGEMLOS-5 to pGEMLOS-4, the MS/MS data, in combination with our 
methylation analysis results, allow the partial structures of the chimeric 
oligosaccharides to be deduced as shown in Figure 3. 

The structural data support the prediction that E. coli K-12 transformed with 
plasmids containing portions of an eight gene segment from H. influenzae 
involved in LOS biosynthesis makes chimeric LPS which can be modified to 
produce oligosaccharide essentially identical to that of H. influenzae. 
Moreover, we have shown that the chimeric LPS are segregated hybrid-type 
structures, where the E. coli R-LPS core structure is first synthesized and 
then serves as a scaffold for H. influenzae LOS biosynthesis enzymes to add a 
second independent set of sugars not found in the parent E. coli strain. 
Thus, the biosynthetic pathways appear to be sequential (segregated) and not 
intermixed. 

Before this invention was made, the role of the terminal branch heptose in 
the E. coli R-LPS as the acceptor for oligosaccharide elongation or the 
requirement for a functional initiator enzyme was unknown. The published 
structure for the complete E. coli K-12 core did not contain a second 
terminal heptose, but rather had this fourth heptose as part of the inner 
core region. The oligosaccharide branch was believed to terminate in glucose, 
which was proposed to be the acceptor site for O-antigen and other 
substituents. The role of the initiator enzyme was unknown. It is now 
apparent that only E. coli R-LPS structures containing this fourth heptose 
(i.e., complete core structures) underwent elongation in the 
plasmid-transformed chimeric strains and thus, only those E. coli having this 
composition are useful as production cells for the production of H. 
influenzae oligosaccharide. In the chimeric structures, GlcNAc is the first 
sugar added to the seven position of this heptose. There are two possible 
explanations for this crucial first step in the elongation sequence. One, an 
N-acetylglucosamine-specific glycosyltransferase from Haemophilus encoded in 
orfF or orfG either has this precise specificity or is promiscuous enough to 
allow this reaction to occur. Two, some analogous E. coli glycosyltransferase 
gene is being activated by a Haemophilus regulatory gene. Since sequence 
comparisons of the seven genes contained in this plasmid suggest that both 
glycosyltransferase and regulatory genes are present, both explanations are 
possible. However, the fact that terminal GlcNAc and 1,7-linked heptose have 
been found in non-stoichiometric amounts in some other strains of E. coli 
K-12 suggests that the addition of this sugar to the terminus of the 
oligosaccharide branch is accomplished by E. coli enzymes. It was recently 
reported that when mutations causing the rough phenotype in E. coli K-12 are 
complemented, the complemented strains produce an O-antigen which has GlcNAc 
at the reducing terminus of the repeat unit. Regardless of the mechanism of 
this first key step in the extension of the pGEM core, it appears that 
addition of this GlcNAc is rate-limiting, since a large percentage of 
unmodified pGEM core R-LPS remains in the chimeric mixtures. Furthermore, few 
or no intermediate structures are observed as one progresses from pGEMLOS-7 
to pGEMLOS-5 and pGEMLOS-4, suggesting that once this first GlcNAc is added, 
the addition of the other Haemophilus-related sugars proceeds quickly to 
defined end points. If other steps in the biosynthesis of the chimeric LPS 
were as incomplete as the addition of this first GlcNAc, one would expect to 
see these other intermediate structures, yet none were observed. Therefore, 
it is likely that an N-acetylglucosaminyltransferase from E. coli that is 
regulated through the product of orfF or orfG adds this first key sugar in 
the chimeric structures. Preliminary data on a chimeric construct containing 
orfG alone showed a mass shift of 203 Da (HexNAc), suggesting that orfG is 
this regulatory gene. 

The second step in the biosynthesis of the chimeric LPS is the addition of 
galactose to the 3-position of the terminal GlcNAc. The resulting 
disaccharide, Gal1-3GlcNAc, is the structural moiety observed in pGEMLOS-7, 
which arises when the transforming plasmid contains orfs F-G from 
Haemophilus. Examination of the predicted amino acid sequences of the gene 
products indicates that orfF has high homology (66% identity) to a 
galactosyltransferase (amsE) from Erwinia amylovora, suggesting that it may 
encode a galactosyltransferase in Haemophilus. orfG does not show any 
homology to known oligosaccharide biosynthetic genes, but is homologous (64% 
identity) to a gene encoding the ModE protein in E. coli. This protein is 
involved in molybdenum transport and regulation of transcription, suggesting 
that orfG may encode a regulatory protein from Haemophilus (which may be 
regulating an N-acetylglucosaminyltransferase gene from E. coli in the 
chimeric strains). 

In the pGEMLOS-5 strain, an additional three genes are contained in the 
transforming plasmid (orfs C-G) and an additional GlcNAc and Gal are observed 
in the resulting LPS. These sugars now define the tetrasaccharide 
Gal1-4GlcNAc1-3Gal1-3GlcNAc. The LPS from this transformant is now reactive 
to the 3F11 MAb, suggesting that this new disaccharide is beta-linked to form 
the terminal trisaccharide, Gal1-4GlcNAc1-3Gal. All of the new orfs contained 
in this plasmid have some homology with known glycosyltransferase genes: orfC 
has homology with amsD (26% identity) from Erwinia amylovora, which encodes a 
glycosyltransferase for exopolysaccharide synthesis, and with trsD (38% 
identity) from Yersinia enterocolitica, a gene involved in LPS inner core 
synthesis; orfD has homology with the sialyltransferase gene (lst) (27% 
identity) from Neisseria gonorrhoeae; and orfE has homology with a putative 
glycosyltransferase gene (77% identity) from Actinobacillus sp. and the 
galactosyltransferase gene, amsB (11% identity), from Erwinia amylovora. The 
fact that these three additional orfs in the transforming plasmid apparently 
result in the addition of only two more sugars to the growing oligosaccharide 
chain may indicate that the acceptor for one of the glycosyltransferases is 
absent in the chimeric LPS. 

When two more orfs are added in the transforming plasmid (orfs A-G) to form 
the pGEMLOS-4 chimeric strain, we observe that the 3F11 epitope disappears 
and the terminal Gal residue of the epitope is capped by either a second Gal 
or a GlcNAc moiety, apparently linked to the 6-position of the Gal. These new 
species present in the pGEMLOS-4 LPS population were also observed to contain 
a third Kdo moiety, presumably somewhere in their core regions. While some of 
the incomplete core structures found in the wild-type E. coli K-12 LPS 
populations also contain a third Kdo, of the chimeric structures, only the 
structures unique to pGEMLOS-4 were found to contain a third Kdo. This 
chimeric strain was also recognized by MAb 6E4, which recognizes an inner 
core, Kdo-related epitope in H. influenzae, suggesting that this third Kdo 
forms a different epitope than the one found in the core structure of the 
wild-type E. coli LPS. Thus, the addition of orfs A and B to the transforming 
plasmid which formed strain pGEMLOS-4 seems to have multiple effects on the 
chimeric LPS structure. OrfB is homologous (46% identity) to the 
sialyltransferase gene from N. gonorrhoeae. OrfA is homologous to both the 
RfbX gene product (22% identity) from E. coli and TrsA (24% identity) of Y. 
enterocolitica. These are putative O-antigen transporters (36,37), suggesting 
that orfA may encode a flippase. 

While sialyl-N-acetyllactosamine-containing structures are only minor 
components of the wild-type H. influenzae type b strain A2 LOS population, we 
have previously seen that lsg genes are involved in the synthesis of this 
epitope. Transposon mutagenesis of orfD produced mutant strain 281.25, which 
lost all ability to add galactose to Hib LOS glycoforms. This strain could 
not make any of the wild-type LOS structures larger than the major species 
containing four glucoses and three heptoses. Mutation of orfE (which is 
downstream of orfD) produced strain 276.4, which had essentially the same 
defect, except for one important difference: strain 276.4 retained the 
ability to make the sialyl-N-acetyllactosamine epitope. These results suggest 
that in the transposon mutants, the knockout of orfD has a polar effect on 
orfE, which would imply that the gene product of orfE is a 
galactosyltransferase required for synthesis of the higher molecular weight 
wild-type structures containing terminal galactose(s) on their glucose 
disaccharide branches and that the gene product of orfD is likely an 
N-acetylglucosaminyltransferase required for the synthesis of the 
sialyl-N-acetyllactosamine epitope. The case for these assignments can be 
made on the basis of the homologies noted above (orfE is homologous to a 
galactosyltransferase gene) and the LOS glycoforms observed in the 276.4 and 
281.25 mutant strains. Since no truncated versions of the 
sialyl-N-acetyllactosamine structure were seen in the 276.4 LOS population 
(i.e., no species lacking either sialic acid or sialic acid plus galactose), 
it seems probable that the orfD gene codes for the glycosyltransferase which 
adds the GlcNAc to the oligosaccharide branch. This is also consistent with 
the observation that one of the genes in orfs C-E is apparently responsible 
for adding GlcNAc to the 3-position of the Gal which is terminal in the 
pGEMLOS-7 LPS structure. 

This chimeric carbohydrate expression system has provided information that is 
relevant to unraveling the functions of these lsg genes and has the 
additional advantage of being carried out in the absence of the normal 
endogenous genetic background of H. influenzae. Indeed, while gene knockouts 
of some of the lsg genes in H. influenzae have been completed, downstream or 
regulatory gene effects can often complicate their functional analysis. In 
this E. coli expression system, structural analysis of the resulting chimeric 
LPS has shown that synthesis proceeded as a serial (non-parallel) synthesis, 
that is, the new elements of the chimeric LPS were added after the formation 
of the E. coli R-LPS. The fact that this synthesis was sequential (rather 
than interdigitated with the R-LPS synthesis, for example) allowed the 
functions of these H. influenzae gene products to be more readily delineated 
from the chimeric oligosaccharide structures. Moreover, screening of the 
chimeric LPS products with monoclonal antibodies enabled us to follow the 
formation of terminal sugar sequences (epitopes) that are unique to the 
Haemophilus strain from which the plasmid DNA originated. 

All publications and patents cited herein are incorporated by reference as 
though fully set forth. 

This invention has been described with respect to specific examples and 
embodiments. However, it is understood that one skilled in the art may make 
variations or modifications that are within the spirit and scope of the invention. 
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Table 1. Bacterial Strains, LPS and Vectors 



Strain/Plasmid   Relevant characteristics                  Reference/source 

E. coli 
  JM109          recA, supE, hsdR, Δ(lac-pro)              (40) 

H. influenzae 
  A2             Parental strain                           (10) 

pGEM3Zf+         ApR                                       Promega Biotech 
pGEMLOS-4        ApR, contains 7.4 kb BamHI-PstI DNA       This study 
                 H. influenzae lsg locus 
pGEMLOS-5        ApR, contains 5.5 kb HindIII-PstI DNA     This study 
                 H. influenzae lsg locus 
pGEMLOS-7        ApR, contains 2.8 kb SphI-PstI DNA        This study 
                 H. influenzae lsg locus 

LPS 
  pGEM           isolated from strain JM109                This study 
                 transformed with pGEM3Zf+ 
  pGEMLOS-4      isolated from strain JM109                This study 
                 transformed with the plasmid pGEMLOS-4 
  pGEMLOS-5      isolated from strain JM109                This study 
                 transformed with the plasmid pGEMLOS-5 
  pGEMLOS-7      isolated from strain JM109                This study 
                 transformed with the plasmid pGEMLOS-7 
Table 4. Molecular weights (Mr) and proposed compositions of the dephosphorylated 
oligosaccharides from pGEM, pGEMLOS-7, pGEMLOS-5, and pGEMLOS-4^a 

Fraction            Obs. Mr   Calc. Mr  Calc. Mr  Proposed composition 
                    (avg.)    (avg.)    (exact) 

pGEM (Fr. 2)        1300.8    1301.1    1300.4    3 Hex, 3 Hep, 1 Kdo 
pGEM (Fr. 1)        1655.3    1655.4    1654.5    4 Hex, 4 Hep, 1 Kdo 
pGEMLOS-7           2020.3    2020.8    2019.7    1 HexNAc, 5 Hex, 4 Hep, 1 Kdo 
pGEMLOS-5           2386.3    2386.1    2384.8    2 HexNAc, 6 Hex, 4 Hep, 1 Kdo 
pGEMLOS-4 (Fr. 2)   2386.3    2386.1    2384.8    2 HexNAc, 6 Hex, 4 Hep, 1 Kdo 
pGEMLOS-4 (Fr. 1)   2386.1    2386.1    2384.8    2 HexNAc, 6 Hex, 4 Hep, 1 Kdo 
                    2548.4    2548.2    2546.8    2 HexNAc, 7 Hex, 4 Hep, 1 Kdo 
                    2589.5    2589.3    2587.9    3 HexNAc, 6 Hex, 4 Hep, 1 Kdo 

^a Observed molecular weights are reported as average mass values. 



Table 5. Monosaccharide compositions of the dephosphorylated oligosaccharide 
fractions^a 

        pGEM      pGEM      pGEM-     pGEM-     pGEM-LOS-4  pGEM-LOS-4 
        (Fr. 2)   (Fr. 1)   LOS-7     LOS-5     (Fr. 2)     (Fr. 1) 

GlcN    -         -         1.?       2.7       2.4         3.? 
Gal     0.?       1.1       1.7       3.0       2.8         ? 
Glc     2.2       3.2       3.2       ?         ?           ? 
Hep     ?         ?         ?         ?         ?           4.0 

^a Molar ratios were derived from comparison to the hydrolysate of the S. 
typhimurium Ra oligosaccharide of known composition, and then those values 
were normalized to either 3.0 or 4.0 heptoses in each fraction. 



Table 6. Methylation analysis of the dephosphorylated oligosaccharide 
fractions^a 

Fractions: pGEM (Fr. 2), pGEM (Fr. 1), pGEM-LOS-7, pGEM-LOS-5, pGEM-LOS-4 
(Fr. 2), pGEM-LOS-4 (Fr. 1) 

Residues: T-Glc, T-Gal, T-GlcNAc, 1,4-GlcNAc 

^a Peak areas were measured from the GC/MS EI total ion chromatograms, and 
values were normalized to the 1,3,6-Glc residue. The data for pGEMLOS-7 are 
the average of two runs. 
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